The general stable quantum memory unit is a hybrid consisting of a classical digit with a quantum digit
(qudit) assigned to each classical state. The shape of the memory is the vector of sizes of these qudits, which
may differ. We determine when $N$ copies of a quantum memory $\mathcal{A}$ embed in $N(1+o(1))$ copies of another
quantum memory $\mathcal{B}$. This relationship captures the notion that $\mathcal{B}$ is at least as useful as $\mathcal{A}$ for all purposes
in the bulk limit. We show that the embeddings exist if and only if for all $p \ge 1$, the $p$-norm of the shape of
$\mathcal{A}$ does not exceed the $p$-norm of the shape of $\mathcal{B}$. The log of the $p$-norm of the shape of $\mathcal{A}$ can be interpreted
as the maximum of $S(\rho) + H(\rho)/p$ (quantum entropy plus discounted classical entropy) taken over all mixed
states $\rho$ on $\mathcal{A}$. We also establish a noiseless coding theorem that justifies these entropies. The noiseless coding
theorem and the bulk embedding theorem together say that either $\mathcal{A}$ blindly bulk-encodes into $\mathcal{B}$ with perfect
fidelity, or $\mathcal{A}$ admits a state that does not visibly bulk-encode into $\mathcal{B}$ with high fidelity.

In conclusion, the utility of a hybrid quantum memory is determined by its simultaneous capacity for classical 
and quantum entropy, which is not a finite list of numbers, but rather a convex region in the classical-quantum 
entropy plane. 



1. INTRODUCTION 

Many questions in quantum information theory involve
both quantum and classical information. The usual computational
model for such dual information is independent quantum
and classical memory. The measurement algebra of a
combined memory consisting of an $a$-state qudit and a $b$-state
classical digit is

$$\bigoplus_{k=1}^{b} \mathcal{M}_a,$$

where $\mathcal{M}_a$ is the set of $a \times a$ matrices. But this is not the
most general possible hybrid of classical and quantum memory.
Rather the measurement algebra $\mathcal{A}$ of a finite memory
could be any direct sum of matrix algebras of possibly different
dimensions:

$$\mathcal{A} \cong \bigoplus_{k=1}^{n} \mathcal{M}_{\lambda_k}.$$

The partition (i.e., non-negative integral vector) $\lambda = \lambda(\mathcal{A})$
is a list of the dimensions of the matrix algebras called the
shape of the memory $\mathcal{A}$. Section 2 discusses why this is a
reasonably general quantum memory model.

For example, the simplest hybrid memory is a hybrid trit,
with shape $(2, 1)$. It consists of matrices of the form

$$\begin{pmatrix} * & * & 0 \\ * & * & 0 \\ 0 & 0 & * \end{pmatrix}.$$

This memory models a three-state system in which one state
is observed by the environment but the other two remain coherent
relative to each other. It is easy to compare the capacity
of the hybrid trit to any other quantum memory: It is between
a qubit and a qutrit, more than a classical trit, less than any
larger memory that contains a qubit, and neither more nor less
than a classical digit with at least 4 states.

It turns out that there is more than one notion by which one
memory unit has more capacity than another. (Atypically, all
such notions are equivalent for the hybrid trit.) The strictest
relevant relationship between memories is given by algebra
embeddings. If $\mathcal{A} \hookrightarrow \mathcal{B}$ is an algebra embedding (which
need not be unit-preserving, or unital), then the memory $\mathcal{B}$
can simulate the memory $\mathcal{A}$. In other language, an algebra
embedding is a blind, perfect-fidelity decoding. Section 2 also
explains that although other blind, perfect-fidelity encodings
are possible, any such encoding can be replaced by an algebra
embedding. As Section 3.1 explains, the question of whether
$\mathcal{A}$ embeds in $\mathcal{B}$ is a computable (but NP-hard) bin-packing
problem.

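In this article we will consider a more relaxed comparison,
namely whether many copies of $\mathcal{A}$ embed in slightly more
copies of $\mathcal{B}$. More precisely we say that $\mathcal{A}$ bulk-embeds in
$\mathcal{B}$, or $\mathcal{A} \overset{b}{\hookrightarrow} \mathcal{B}$, if for every rational $\epsilon > 0$, there exists an $N$
such that

$$\mathcal{A}^{\otimes N} \hookrightarrow \mathcal{B}^{\otimes N(1+\epsilon)}.$$

If $\mathcal{A}$ bulk-embeds in $\mathcal{B}$, there is no reason to pay more for
$\mathcal{A}$ than $\mathcal{B}$ when buying large quantities of the two memories
with equal performance. Our first main result is a characterization
of when $\mathcal{A}$ bulk-embeds in $\mathcal{B}$: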

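Theorem 1.1. If $\mathcal{A}$ and $\mathcal{B}$ are two hybrid memories, then
$\mathcal{A} \overset{b}{\hookrightarrow} \mathcal{B}$ (that is, $\mathcal{A}$ bulk-embeds in $\mathcal{B}$) if and only if

$$\|\lambda(\mathcal{A})\|_p \le \|\lambda(\mathcal{B})\|_p$$

for all $p \in [1, \infty]$.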

One direction of Theorem 1.1 is straightforward. The $p$-norm
of a partition $\lambda$ is defined as

$$\|\lambda\|_p = \Bigl(\sum_k \lambda_k^p\Bigr)^{1/p}.$$

It is easy to check that the $p$-norm is multiplicative:

$$\|\lambda(\mathcal{A} \otimes \mathcal{B})\|_p = \|\lambda(\mathcal{A})\|_p \, \|\lambda(\mathcal{B})\|_p$$

for any pair of memories $\mathcal{A}$ and $\mathcal{B}$. On the other hand the
bin-packing model implies that if $\mathcal{A}$ embeds in $\mathcal{B}$, then

$$\|\lambda(\mathcal{A})\|_p \le \|\lambda(\mathcal{B})\|_p.$$

It follows that this inequality also holds when $\mathcal{A}$ bulk-embeds
in $\mathcal{B}$. The proof of the other direction of Theorem 1.1 is the
topic of Section 3.
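The multiplicativity of the $p$-norm can be checked numerically. The following is a minimal sketch, assuming a shape is stored as a plain Python list of block sizes; the helper names `p_norm` and `tensor_shape` are illustrative and do not come from the paper. It uses the fact that the tensor product of two direct sums of matrix algebras has one block of size $ab$ for each pair of blocks of sizes $a$ and $b$.

```python
import math
from itertools import product


def p_norm(shape, p):
    """Return the p-norm of a shape (a list of block sizes); p may be math.inf."""
    if p == math.inf:
        return float(max(shape))
    return sum(x ** p for x in shape) ** (1.0 / p)


def tensor_shape(shape_a, shape_b):
    """Shape of a tensor product: one block of size a*b per pair of blocks."""
    return sorted((a * b for a, b in product(shape_a, shape_b)), reverse=True)


hybrid_trit = [2, 1]          # shape of the hybrid trit from the introduction
two_qubit_blocks = [2, 2]     # hypothetical second memory: a qubit times a bit
combined = tensor_shape(hybrid_trit, two_qubit_blocks)  # blocks 4, 4, 2, 2

# Compare both sides of the multiplicativity identity for a few exponents.
for p in (1, 2, 3.5, math.inf):
    lhs = p_norm(combined, p)
    rhs = p_norm(hybrid_trit, p) * p_norm(two_qubit_blocks, p)
    assert math.isclose(lhs, rhs, rel_tol=1e-12)
```

The check is only a floating-point spot test at a few values of $p$, not a proof.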

The $p$-norm has an interesting information-theoretic interpretation.
In Section 4 we will define the classical entropy
$H(\rho)$ and the quantum entropy $S(\rho)$ of a state $\rho$ of a quantum
memory $\mathcal{A}$. Their definitions are justified by a capacity
estimate, Theorem 1.2, and by a noiseless coding theorem,
Theorem 1.3.



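Theorem 1.2. Every state $\rho$ of a memory $\mathcal{A}$ satisfies the inequality

$$\frac{H_{\mathcal{A}}(\rho)}{p} + S_{\mathcal{A}}(\rho) \le \log \|\lambda(\mathcal{A})\|_p,$$

where $\rho$ has classical entropy $H_{\mathcal{A}}(\rho)$ and quantum entropy
$S_{\mathcal{A}}(\rho)$. For each $p \ge 1$ there exists a $\rho$ that achieves equality.
Any non-negative pair $(H, S)$ satisfying the inequality for all
$p$ can be expressed as

$$(H, S) = (H_{\mathcal{A}}(\rho) + t,\; S_{\mathcal{A}}(\rho) - t)$$

for some $\rho$ and some $t \in [0, 1]$.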
Figure 1: The capacity region of a memory $\mathcal{A}$ with shape $(2, 1, 1)$
and its 3-norm bounding line.



The quantity $\log \|\lambda(\mathcal{A})\|_2$ is half of the dense coding capacity of $\mathcal{A}$.

Theorem 1.2 implies that the set of possible pairs

$$(H_{\mathcal{A}}(\rho) + t,\; S_{\mathcal{A}}(\rho) - t),$$

where $0 \le t \le S_{\mathcal{A}}(\rho)$, forms a convex capacity region $C(\mathcal{A})$
in the first quadrant of the plane. Figure 1 shows an example.
The constant $t$ expresses the fact that quantum entropy
can be used classically. Since the $S$-intercept of the line tangent
to $C(\mathcal{A})$ with slope $-\frac{1}{p}$ is $\log \|\lambda(\mathcal{A})\|_p$, another way to
state Theorem 1.1 is that memory $\mathcal{A}$ bulk-embeds in another
memory $\mathcal{B}$ if and only if $C(\mathcal{A}) \subseteq C(\mathcal{B})$. In other words, $\mathcal{A}$
bulk-embeds in $\mathcal{B}$ if and only if it has no state $\rho$ with too
much entropy to fit in $\mathcal{B}$.
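The norm criterion behind this statement can be explored numerically. The sketch below is a heuristic only: it samples $p$ on a finite, logarithmically spaced grid (plus $p = \infty$), so a violation between grid points could in principle be missed. The function name `passes_norm_test` and the parameters `steps`, `p_max` and `tol` are assumptions made for illustration.

```python
import math


def p_norm(shape, p):
    """p-norm of a list of block sizes; p may be math.inf."""
    if p == math.inf:
        return float(max(shape))
    return sum(x ** p for x in shape) ** (1.0 / p)


def passes_norm_test(shape_a, shape_b, steps=400, p_max=64.0, tol=1e-12):
    """Heuristic check of ||shape_a||_p <= ||shape_b||_p on a finite grid of p.

    The grid runs from p = 1 to p = p_max on a log scale and also includes
    p = infinity. A small relative tolerance absorbs rounding at equality.
    """
    grid = [math.exp(math.log(p_max) * i / steps) for i in range(steps + 1)]
    grid.append(math.inf)
    return all(p_norm(shape_a, p) <= p_norm(shape_b, p) * (1 + tol) for p in grid)


hybrid_trit = [2, 1]
print(passes_norm_test([2], hybrid_trit))             # qubit vs hybrid trit
print(passes_norm_test(hybrid_trit, [3]))             # hybrid trit vs qutrit
print(passes_norm_test(hybrid_trit, [1, 1, 1, 1]))    # hybrid trit vs classical 4-digit
print(passes_norm_test([1, 1, 1, 1], hybrid_trit))    # classical 4-digit vs hybrid trit
```

On these examples the test agrees with the comparisons stated for the hybrid trit in the introduction: it sits between a qubit and a qutrit, and it is incomparable with a classical digit with 4 states.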

Our second main result is the following noiseless coding
theorem, which generalizes a result of Barnum, Hayden,
Jozsa, and Winter [1]. The terms of the theorem and a self-contained
proof appear in Section 4.2.

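Theorem 1.3. Let $\mathcal{A}$ be a quantum memory with a state $\rho$ and
let $\mathcal{B}$ be another quantum memory. Then there is a reliable
noiseless coding sequence

$$\mathcal{A}^{\otimes N} \xrightarrow{\;\mathcal{Y}_N\;} \mathcal{B}^{\otimes N(1+\epsilon)} \xrightarrow{\;\mathcal{X}_N\;} \mathcal{A}^{\otimes N}$$

for every rational $\epsilon > 0$ if and only if $(H_{\mathcal{A}}(\rho), S_{\mathcal{A}}(\rho)) \in C(\mathcal{B})$.
Here "reliable" means that the complete fidelity
$F(\rho^{\otimes N}, \mathcal{X}_N \circ \mathcal{Y}_N) \to 1$ as $N \to \infty$.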

The "no-go" direction of Theorem 1.3 depends on an interesting
Hölder inequality for fidelity of encodings, Theorem 4.1.
In simplified form, our inequality says that if

$$\mathcal{A} \xrightarrow{\;\mathcal{Y}\;} \mathcal{B} \xrightarrow{\;\mathcal{X}\;} \mathcal{A}$$

are two quantum operations and $\frac{1}{p} + \frac{1}{q} = 1$, then

$$\mathrm{Tr}(\mathcal{X} \circ \mathcal{Y}) \le \|\lambda(\mathcal{A})\|_q \, \|\lambda(\mathcal{B})\|_p.$$

This inequality is a broad generalization of the following elementary
combinatorial fact: If a (uniformly) random number
$x$ from 1 to $a$ is encoded into a random number from 1 to $b$
with $b < a$ and decoded back again, then the probability that
$x$ is recovered is at most $b/a$.
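The elementary combinatorial fact can be confirmed by brute force for small $a$ and $b$. This is a minimal sketch, assuming it is enough to enumerate deterministic encoders and decoders, since a randomized strategy is an average of deterministic ones and so cannot beat the best of them; the function name `best_recovery` is illustrative.

```python
from fractions import Fraction
from itertools import product


def best_recovery(a, b):
    """Best success probability for a uniform x in {0..a-1} sent through b codes.

    Enumerates every deterministic encoder {0..a-1} -> {0..b-1} and every
    deterministic decoder {0..b-1} -> {0..a-1}, so it is only usable for tiny a, b.
    """
    best = Fraction(0)
    for enc in product(range(b), repeat=a):        # enc[x] is the code assigned to x
        for dec in product(range(a), repeat=b):    # dec[y] is the guess for code y
            hits = sum(1 for x in range(a) if dec[enc[x]] == x)
            best = max(best, Fraction(hits, a))
    return best


print(best_recovery(4, 2))
print(best_recovery(3, 2))
```

In both cases the optimum equals $b/a$, which matches the stated bound.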

In conclusion, Theorem 1.3 is an important converse to
Theorem 1.1. Together they say that if $\mathcal{A}$ and $\mathcal{B}$ are two
hybrid quantum memories, then either $\mathcal{A}$ blindly bulk-encodes
into $\mathcal{B}$ with perfect fidelity, or $\mathcal{A}$ has a state $\rho$ that
does not visibly bulk-encode into $\mathcal{B}$ with high fidelity.



Note that the three most common $p$-norms are also significant
for quantum information theory. The logarithm of the
1-norm, $\log \|\lambda(\mathcal{A})\|_1$, is the purely classical capacity of $\mathcal{A}$.
The logarithm of the $\infty$-norm, $\log \|\lambda(\mathcal{A})\|_\infty$, is the purely
quantum capacity. And the logarithm of the 2-norm is

$$\log \|\lambda(\mathcal{A})\|_2 = \frac{\log \dim \mathcal{A}}{2}.$$


2. MEMORY 

As explained in the introduction, the first question is
whether our model of a hybrid memory is adequately general.
One justification comes from viewing a quantum system
not as a Hilbert space, but as an abstract operator algebra $\mathcal{A}$.
If $\mathcal{A}$ is infinite-dimensional, it should satisfy some analytic
axioms in order to be useful for quantum probability theory;
usually it is assumed to be either a C*-algebra or a von Neumann
algebra [8, 9]. But if it is finite-dimensional, it suffices
to require that $\mathcal{A}$ be a (positive-definite) *-algebra; it is then
also a C*-algebra and a von Neumann algebra. This means
that in addition to the fact that $\mathcal{A}$ is a complex vector space
with associative multiplication, it has an abstract *-operation
which is anti-linear, product-reversing, and suitably positive-definite:

$$(AB)^* = B^* A^*, \qquad A^* A = 0 \implies A = 0.$$

Positive definiteness leads to an important partial ordering on
$\mathcal{A}$. By definition $X \ge Y$ if $X - Y = A^* A$ for some $A$.

For example, the matrix algebra $\mathcal{M}_a$ is a *-algebra.

Despite their abstraction, *-algebras have all of the necessary
structure for quantum information theory. The elements
of a *-algebra $\mathcal{A}$ of the form $A^* A$ are called positive. A state
$\rho$ on a *-algebra $\mathcal{A}$ is defined as a dual vector $\rho \in \mathcal{A}^*$ which
is positive on positive elements and which is normalized by
$\rho(I) = 1$. Consequently we write $\rho(A)$ for the expectation
of $A$ rather than $\mathrm{Tr}(\rho A)$. (The latter notation is of course
equivalent when $\mathcal{A}$ is a matrix algebra; it expresses $\rho$ as a
density operator.) A quantum operation from a system with
*-algebra $\mathcal{A}$ to a system with *-algebra $\mathcal{B}$ is defined as a unital,
completely positive (UCP) linear map $\mathcal{E} : \mathcal{A} \to \mathcal{B}$. Here
completely positive means that $\mathcal{E}$ sends positive elements to
positive elements after tensoring with the identity on a third
*-algebra. Note that the transpose $\mathcal{E}^T : \mathcal{B}^* \to \mathcal{A}^*$ is the corresponding
map on states. It is completely positive and trace-preserving
if we take $\rho(I)$ to be the trace of $\rho$.

It will be useful to consider a larger class of maps than
traditional quantum operations. A completely positive map
$\mathcal{E} : \mathcal{A} \to \mathcal{B}$ is subunital (or SUCP) if $\mathcal{E}(I) \le I$. Whereas
a UCP map conserves probability, an SUCP map either conserves
or diminishes it. An SUCP map can be physically realized
in the same way as a UCP map, with the extra interpretation
that missing probability corresponds to ending the
experiment. An SUCP map can also be called a decay quantum
operation.

A standard classification theorem [3] says that every finite-dimensional
*-algebra $\mathcal{A}$ is a direct sum of matrix algebras,

$$\mathcal{A} \cong \bigoplus_{k=1}^{n} \mathcal{M}_{\lambda_k}.$$

Thus a quantum memory of shape $\lambda$ is the most general possible
finite-dimensional complex algebra of observables satisfying
reasonable algebraic axioms. (However abandoning $\mathbb{C}$
as the field of scalars leads to other possibilities [4].)

Another justification comes from the interaction of a physical
memory with its environment. Consider a physical device
whose state is defined by a *-algebra $\mathcal{M}$. Realistically $\mathcal{M}$ is
very large, but almost all of it is thermally coupled to the environment.
Its decoherence on the thermal time scale is given by
some decay quantum operation $\mathcal{E} : \mathcal{M} \to \mathcal{M}$. If the thermal
time scale is much shorter than the computational time scale,
then the information retained by $\mathcal{E}^n$ in the limit $n \to \infty$ is the
reliable memory of $\mathcal{M}$.



Certainly any finite-dimensional *-algebra $\mathcal{A}$ is the reliable
memory retained by some quantum operation on a matrix algebra
$\mathcal{M}_d$. In the minimal construction, let $d = \|\lambda(\mathcal{A})\|_1$ be
the total size of all blocks of $\mathcal{A}$. We realize $\mathcal{A} \subseteq \mathcal{M}_d$ as matrices
with a diagonal block of size $\lambda_k(\mathcal{A})$ for each $k$. The
algebra $\mathcal{M}_d$ has a POVM whose $k$th element $P_k$ is the identity
of the $k$th summand $\mathcal{A}_k$. The corresponding quantum operation

$$\mathcal{P}(A) = \sum_{k=1}^{n} P_k A P_k$$

is a projection, meaning $\mathcal{P}^2 = \mathcal{P}$, and its image is $\mathcal{A}$. If
the thermal evolution of $\mathcal{M}_d$ is given by $\mathcal{P}$, the algebra $\mathcal{A}$
measures the retained information.
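The map $\mathcal{P}$ simply erases every matrix entry that lies outside the diagonal blocks. Below is a minimal sketch of that action on a matrix stored as a list of rows, assuming the blocks are laid out along the diagonal in the order given by the shape; the helper names `block_labels` and `pinch` are illustrative.

```python
def block_labels(shape):
    """Label each of the d = sum(shape) basis vectors by the block it belongs to."""
    labels = []
    for k, size in enumerate(shape):
        labels.extend([k] * size)
    return labels


def pinch(matrix, shape):
    """Compute sum_k P_k A P_k: keep entries inside diagonal blocks, zero the rest."""
    labels = block_labels(shape)
    d = len(labels)
    if len(matrix) != d or any(len(row) != d for row in matrix):
        raise ValueError("matrix must be d x d with d = sum(shape)")
    return [[matrix[i][j] if labels[i] == labels[j] else 0 for j in range(d)]
            for i in range(d)]


A = [[1, 2, 3],
     [4, 5, 6],
     [7, 8, 9]]
once = pinch(A, [2, 1])       # hybrid trit: one 2x2 block and one 1x1 block
twice = pinch(once, [2, 1])   # applying the map again changes nothing
print(once)
```

Applying `pinch` twice gives the same result as applying it once, which is the projection property $\mathcal{P}^2 = \mathcal{P}$ in this example.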

Conversely, the following two results show that if $\mathcal{E}$ is a
(decay) quantum operation on a finite-dimensional *-algebra,
the information retained by $\mathcal{E}^n$ in the limit $n \to \infty$ is measured
by a smaller *-algebra of effective observables. (See
also Zurek fll.') 

Theorem 2.1. Let $\mathcal{E} : \mathcal{M} \to \mathcal{M}$ be an SUCP map on a finite-dimensional
*-algebra $\mathcal{M}$. Then there exists a sequence of
integers $n_k \to \infty$ such that $\mathcal{E}^{n_k}$ converges to a unique projection
$\mathcal{P}$.

Proof. (Sketch) Choose a basis of $\mathcal{M}$ that puts $\mathcal{E}$ in Jordan
canonical form. Since $\mathcal{E}^n$ is SUCP, its matrix entries are
bounded. Therefore $\mathcal{E}$ has no eigenvalues $\lambda$ with $|\lambda| > 1$,
and if $|\lambda| = 1$, the $\lambda$-isotypic part of $\mathcal{E}$ is diagonal. Choose a
sequence of exponents $n_k \to \infty$ such that the phases of these
diagonal entries of $\mathcal{E}^{n_k}$ are aligned with 1 in the limit. The
rest of the matrix of $\mathcal{E}^n$ decays to 0 as $n \to \infty$. The map is
unique because if the phases do not align with 1, the limiting
map is not a projection. □

Finally a result of Choi and Effros [5, pp. 166-7] completes
our justification for the *-algebra model.

Theorem 2.2 (Choi, Effros). If $\mathcal{M}$ is a finite-dimensional
*-algebra and $\mathcal{P}$ is an SUCP projection on $\mathcal{M}$, then the image
of $\mathcal{P}$ is a *-algebra $\mathcal{A}$ with a modified product $A \circ B =
\mathcal{P}(AB)$.

The non-trivial part of Theorem 2.2 (which more generally
holds for C*-algebras) is the fact that the modified product
$A \circ B$ is associative. The modified product structure is consistent
with applying $\mathcal{P}$ between any two computational manipulations
of $\mathcal{M}$. Technically speaking, Choi and Effros prove
Theorem 2.2 for UCP maps, but the proof for SUCP maps is
the same.

A quantum operation $\mathcal{X} : \mathcal{B} \to \mathcal{A}$ is a blind, perfect-fidelity
encoding if it has a right inverse $\mathcal{Y} : \mathcal{A} \to \mathcal{B}$, which is
then called the decoding. In this case the reverse composition
$\mathcal{Y} \circ \mathcal{X}$ is a UCP projection $\mathcal{P}$. Moreover, $\mathcal{Y}$ identifies $\mathcal{A}$
with the Choi-Effros algebra structure on $\mathrm{im}\,\mathcal{P}$. This construction
is reversible: Given $\mathcal{P}$, we can define $\mathcal{A}$ to be $\mathrm{im}\,\mathcal{P}$
with its Choi-Effros structure. Certainly if $\mathcal{Y}$ embeds $\mathcal{A}$ into
$\mathcal{B}$, then a corresponding $\mathcal{X}$ exists. (If $\mathcal{Y}$ is not unital, then
it is a decay quantum operation, but $\mathcal{X}$ can always be made
non-decay.) Generally, even when $\mathcal{A}$ and $\mathcal{B}$ are abelian, $\mathcal{Y}$ is
not an algebra embedding, but another argument of Choi and
Effros [5, pp. 202-3] says that it always yields one.

Theorem 2.3 (Choi, Effros). If $\mathcal{M}$ is a finite-dimensional
*-algebra and $\mathcal{P}$ is an SUCP projection on $\mathcal{M}$, then $\mathrm{im}\,\mathcal{P}$ also
embeds (non-unitally) as a subalgebra of $\mathcal{M}$.

Theorem 2.3 more generally holds for von Neumann algebras.
The proof adjusts $\mathcal{P}$ in a canonical way. It is not hard to
show that every algebra embedding is a blind, perfect-fidelity
decoding $\mathcal{Y}$; there exists an $\mathcal{X}$ to match it.

3. EMBEDDINGS 

3.1. Bin packing 

Besides embeddability and bulk embeddability, we will 
also compare memories using a partial ordering on partitions 
which resembles dominance 1 1 3l Ch.7], or majorization, but 
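is stricter. The partition $\lambda$ supermajorizes the partition $\mu$, or
$\mu \preceq_S \lambda$, if for every $n$, the sum of all parts of $\lambda$ that are at
least $n$ is at least the same sum for $\mu$. Lemma 3.1 below and
Theorem 1.1 imply that supermajorization lies between embeddability
and bulk embeddability:

$$\mathcal{A} \hookrightarrow \mathcal{B} \;\implies\; \lambda(\mathcal{A}) \preceq_S \lambda(\mathcal{B}) \;\implies\; \mathcal{A} \overset{b}{\hookrightarrow} \mathcal{B}.$$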

We can view the parts of a partition $\lambda$ as an unordered multiset
$\{\lambda_k\}$. It is sometimes convenient to assume a specific
order on the parts. In this case we follow the usual convention
that the parts of $\lambda$ are non-increasing:

$$\lambda_1 \ge \lambda_2 \ge \cdots \ge \lambda_n \ge 1.$$

Given a partition $\lambda$, let $\lambda_{\ge x}$ denote the sum of all parts of $\lambda$
that are at least $x$. Thus $\lambda \preceq_S \mu$ means that

$$\lambda_{\ge x} \le \mu_{\ge x}$$

for all $x$. Obviously integer values of $x$ suffice, but it will be
convenient later to allow non-integer values. Also $\ell\lambda$ denotes
$\lambda$ with each part repeated $\ell$ times. (This is not to be confused
with magnifying each part by a factor of $\ell$.)
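These definitions translate directly into a few lines of code. The sketch below assumes a partition is stored as a list of positive integers and tests only integer thresholds, which the text notes is sufficient; the function names are illustrative.

```python
def tail_sum(shape, x):
    """Sum of all parts of the partition that are at least x."""
    return sum(part for part in shape if part >= x)


def repeat_parts(shape, ell):
    """The partition with each part repeated ell times (parts are not scaled)."""
    return sorted(list(shape) * ell, reverse=True)


def is_supermajorized_by(lam, mu):
    """True if tail_sum(lam, x) <= tail_sum(mu, x) for every integer x >= 1.

    Thresholds above the largest part of lam need no check, because the
    tail sum of lam is zero there.
    """
    top = max(lam, default=0)
    return all(tail_sum(lam, x) <= tail_sum(mu, x) for x in range(1, top + 1))


print(is_supermajorized_by([2, 1], [3]))
print(is_supermajorized_by([1, 1, 1, 1], [2, 1]))
print(repeat_parts([2, 1], 2))
```

For the hybrid trit, `is_supermajorized_by([2, 1], [3])` holds, while a classical digit with 4 states is not supermajorized by the hybrid trit because the tail sums at $x = 1$ are 4 and 3.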

In order to analyze bulk embeddings and prove Theorem 1.1,
we first analyze ordinary embeddings [3]. If $\mathcal{A}$ and
$\mathcal{B}$ are finite-dimensional *-algebras, then any algebra homomorphism
$f : \mathcal{A} \to \mathcal{B}$ is characterized by a Bratteli diagram
$\Gamma$ whose vertices are the summands of $\mathcal{A}$ and $\mathcal{B}$. Let $\mathcal{A}_k$ be
the $k$th summand of $\mathcal{A}$, so that $\mathcal{A}_k \cong \mathcal{M}_{\lambda_k}$, and likewise for
$\mathcal{B}$. If we denote the adjacency matrix of $\Gamma$ by $\Gamma$ as well, then
the diagram's interpretation is that $f$ embeds $\Gamma_{j,k}$ copies of $\mathcal{A}_j$
in $\mathcal{B}_k$. The matrix $\Gamma$ must satisfy the inequality

$$\sum_j \Gamma_{j,k} \, \lambda_j(\mathcal{A}) \le \lambda_k(\mathcal{B})$$

for all $k$. (Bratteli diagrams often describe unital homomorphisms,
which require equality.) The homomorphism $f$ is an
embedding if and only if each summand of $\mathcal{A}$ has at least one
edge, or equivalently that

$$\sum_k \Gamma_{j,k} \ge 1$$

for all $j$.

Thus we can think of $\mathcal{A}$ as a set of 1-dimensional blocks,
$\mathcal{B}$ as a set of 1-dimensional bins, and the embedding as a way
to pack the blocks of $\mathcal{A}$ in the bins of $\mathcal{B}$. The packing might
repeat some of the summands of $\mathcal{A}$, but if there is any embedding,
there is one with no repetition. (Repetition in this sense
has nothing to do with cloning as in the no-cloning theorem.
In representation theory this kind of repetition is usually called
multiplicity.)

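Lemma 3.1. If $\mathcal{A} \hookrightarrow \mathcal{B}$, then $\lambda(\mathcal{A}) \preceq_S \lambda(\mathcal{B})$. If $2\lambda(\mathcal{A}) \preceq_S
\lambda(\mathcal{B})$, then $\mathcal{A} \hookrightarrow \mathcal{B}$.

Proof. Both statements follow by induction on the number of
parts of $\lambda(\mathcal{A})$. They both hold trivially when $\lambda(\mathcal{A})$ is empty.
To prove the first assertion, suppose that in some embedding,
$\mathcal{A}_1$ embeds in $\mathcal{B}_k$. Let $\mathcal{A}'$ be $\mathcal{A}$ with $\mathcal{A}_1$ removed and let $\mathcal{B}'$
be $\mathcal{B}$ with $\mathcal{B}_k$ reduced by $\lambda(\mathcal{A})_1$, or removed if $\lambda(\mathcal{B})_k =
\lambda(\mathcal{A})_1$. By construction, $\mathcal{A}' \hookrightarrow \mathcal{B}'$. Thus by induction,

$$\lambda(\mathcal{A}')_{\ge x} \le \lambda(\mathcal{B}')_{\ge x}$$

for all $x \ge 1$. By the definition of $\mathcal{A}'$ and $\mathcal{B}'$,

$$\lambda(\mathcal{A}')_{\ge x} = \lambda(\mathcal{A})_{\ge x} - \lambda(\mathcal{A})_1$$
$$\lambda(\mathcal{B}')_{\ge x} \le \lambda(\mathcal{B})_{\ge x} - \lambda(\mathcal{A})_1$$

for $x \le \lambda(\mathcal{A})_1$, while $\lambda(\mathcal{A})_{\ge x}$ vanishes for $x > \lambda(\mathcal{A})_1$. Thus

$$\lambda(\mathcal{A})_{\ge x} \le \lambda(\mathcal{B})_{\ge x},$$

as desired.

To prove the second assertion, suppose that $2\lambda(\mathcal{A}) \preceq_S
\lambda(\mathcal{B})$, or equivalently that

$$2\lambda(\mathcal{A})_{\ge x} \le \lambda(\mathcal{B})_{\ge x}$$

for all $x$. We can greedily put $\mathcal{A}_1$ in any $\mathcal{B}_k$ in which it fits
and make $\mathcal{A}'$ and $\mathcal{B}'$ as before. (In this greedy algorithm it
is important to start with the largest summand of $\mathcal{A}$, not an
arbitrary one.) If $\lambda(\mathcal{B})_k < 2\lambda(\mathcal{A})_1$, then

$$\lambda(\mathcal{A}')_{\ge x} = \lambda(\mathcal{A})_{\ge x} - \lambda(\mathcal{A})_1$$
$$\lambda(\mathcal{B}')_{\ge x} \ge \lambda(\mathcal{B})_{\ge x} - 2\lambda(\mathcal{A})_1$$

for all $x \le \lambda(\mathcal{A})_1$, while $\lambda(\mathcal{A}')_{\ge x}$ vanishes for $x > \lambda(\mathcal{A})_1$.
On the other hand if $\lambda(\mathcal{B})_k \ge 2\lambda(\mathcal{A})_1$, then bin $k$ remains
larger than any block even after block 1 is subtracted. In this
case

$$\lambda(\mathcal{A}')_{\ge x} = \lambda(\mathcal{A})_{\ge x} - \lambda(\mathcal{A})_1$$
$$\lambda(\mathcal{B}')_{\ge x} \ge \lambda(\mathcal{B})_{\ge x} - \lambda(\mathcal{A})_1$$

for all $x \le \lambda(\mathcal{A})_1$. Thus

$$2\lambda(\mathcal{A}') \preceq_S \lambda(\mathcal{B}')$$

either way, so the bin packing exists by induction. □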
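The greedy step in the second half of the proof can be sketched as follows. This is an illustration under stated assumptions, not the paper's algorithm verbatim: blocks are placed largest first, and each one goes into the currently largest bin, which is one admissible choice of "any bin in which it fits". Without the hypothesis $2\lambda(\mathcal{A}) \preceq_S \lambda(\mathcal{B})$ a greedy packing may fail even when some embedding exists, since the general problem is NP-hard. The name `greedy_embed` is hypothetical.

```python
def greedy_embed(blocks, bins):
    """Pack blocks into bins greedily, largest block first.

    Returns a list of (block_size, bin_index) pairs, or None if some block
    does not fit into the largest remaining bin.
    """
    remaining = list(bins)
    placement = []
    for block in sorted(blocks, reverse=True):
        if not remaining:
            return None
        # Pick the bin with the most free room (first one on ties).
        k = max(range(len(remaining)), key=remaining.__getitem__)
        if remaining[k] < block:
            return None
        remaining[k] -= block
        placement.append((block, k))
    return placement


# Here twice the shape (2, 1), namely (2, 2, 1, 1), is supermajorized by (4, 2).
print(greedy_embed([2, 1], [4, 2]))
# A case where no packing exists at all.
print(greedy_embed([2, 2], [3]))
```

Choosing the largest remaining bin keeps the sketch short; any rule that selects a bin with enough room would match the wording of the proof equally well.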

3.2. Large deviations 

The proof of Theorem 1.1 combines Lemma 3.1 with the
Chernoff-Cramér theorem on large deviations [6]. The theorem
is usually stated in terms of sums of independent random
variables, but it is more convenient here to formulate it
in terms of convolutions of measures.

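Theorem 3.2 (Chernoff, Cramér). Let $\mu$ be a measure on
an interval $[0, u]$, let

$$\ell(\beta) = \log \int_0^u e^{\beta x} \, d\mu(x)$$

be the logarithm of the Laplace transform of $\mu$ and let $t > 0$.
Then for all $n \in \mathbb{Z}_+$ and all $\beta \ge 0$,

$$\int_{nt}^{\infty} d\mu^{*n} \le e^{n(\ell(\beta) - \beta t)}.$$

If $\ell'(0) < t < u$ and $\beta$ minimizes $\ell(\beta) - \beta t$,
then for all $0 < s < t$,

$$\int_{n(t-s)}^{\infty} d\mu^{*n} \ge e^{n(\ell(\beta) - \beta t - \beta s)} \left(1 - \frac{\ell''(\beta)}{n s^2}\right).$$

Here $\mu^{*n}$ denotes the $n$-fold convolution of $\mu$ with itself.
When $\ell'(0) < t < u$, the expression

$$I(t) = \min_{\beta} \, \ell(\beta) - \beta t$$

is the Legendre transform of $\ell(\beta)$. Note that a unique $\beta$
achieves the minimum because the minimand is concave up,
increases as $\beta \to \infty$, and does not increase at $\beta = 0$.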

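Proof. (Sketch) For any $\beta$,

$$\int_{nt}^{\infty} d\mu^{*n} \le e^{-n\beta t} \int_0^{\infty} e^{\beta x} \, d\mu^{*n}(x) = e^{n(\ell(\beta) - \beta t)}.$$

This establishes the upper bound, Chernoff's inequality.

If $\beta$ is chosen to minimize $\ell(\beta) - \beta t$, then $t = \ell'(\beta)$. In this
case

$$\int_{n(t-s)}^{\infty} d\mu^{*n} \ge e^{-n\beta(s+t)} \int_{n(t-s)}^{n(t+s)} e^{\beta x} \, d\mu^{*n}(x)$$
$$\ge e^{-n\beta(s+t)} \int_0^{\infty} \left(1 - \frac{(x - nt)^2}{(ns)^2}\right) e^{\beta x} \, d\mu^{*n}(x)$$
$$= e^{-n\beta(s+t)} \left(1 - \frac{\ell''(\beta)}{n s^2}\right) e^{n\ell(\beta)}.$$

The equality uses the identities

$$\int x \, e^{\beta x} \, d\mu^{*n}(x) = \bigl(e^{n\ell(\beta)}\bigr)' = n\ell'(\beta) \, e^{n\ell(\beta)}$$
$$\int x^2 e^{\beta x} \, d\mu^{*n}(x) = \bigl(e^{n\ell(\beta)}\bigr)'' = \bigl(n\ell''(\beta) + n^2 \ell'(\beta)^2\bigr) e^{n\ell(\beta)}.$$

This establishes the lower bound, Cramér's theorem. □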
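Chernoff's inequality can be sanity-checked on a small atomic measure by computing the $n$-fold convolution exactly. The sketch below assumes a measure is a dictionary from integer atom positions to weights; the measure `mu`, the values of `n` and `t`, and all function names are hypothetical choices for illustration.

```python
import math
from collections import defaultdict


def convolve(mu, nu):
    """Convolution of two atomic measures given as {position: weight}."""
    out = defaultdict(float)
    for x, wx in mu.items():
        for y, wy in nu.items():
            out[x + y] += wx * wy
    return dict(out)


def convolution_power(mu, n):
    """n-fold convolution of mu with itself."""
    result = {0: 1.0}          # unit atom at 0, the identity for convolution
    for _ in range(n):
        result = convolve(result, mu)
    return result


def log_laplace(mu, beta):
    """ell(beta): logarithm of the Laplace transform of mu."""
    return math.log(sum(w * math.exp(beta * x) for x, w in mu.items()))


def tail(mu, threshold):
    """Total weight of the atoms at positions >= threshold."""
    return sum(w for x, w in mu.items() if x >= threshold)


mu = {0: 0.5, 1: 0.3, 2: 0.2}   # hypothetical atomic measure on [0, 2]
n, t = 12, 1.2
exact = tail(convolution_power(mu, n), n * t)
bounds = [math.exp(n * (log_laplace(mu, b) - b * t)) for b in (0.0, 0.5, 1.0, 2.0)]
print(exact, min(bounds))
```

Every value in `bounds` is an upper bound on `exact`; the best one comes from the $\beta$ that minimizes $\ell(\beta) - \beta t$, which the coarse list of four exponents only approximates.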
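Proof of Theorem 1.1. In brief, without loss of generality

$$\|\lambda(\mathcal{A})\|_p < \|\lambda(\mathcal{B})\|_p$$

for all $p \in [1, \infty]$. In this case we apply Theorem 3.2 to the
measures

$$\mu_{\mathcal{A}} = \sum_k \lambda_k(\mathcal{A}) \, \delta_{\log \lambda_k(\mathcal{A})}, \qquad \mu_{\mathcal{B}} = \sum_k \lambda_k(\mathcal{B}) \, \delta_{\log \lambda_k(\mathcal{B})},$$

where $\delta_x$ denotes a delta function (or atom) at $x$. For sufficiently
large $n$, Chernoff's bound for $\mu_{\mathcal{A}}$ and Cramér's inequality
for $\mu_{\mathcal{B}}$ together imply the criterion

$$2\lambda(\mathcal{A}^{\otimes n})_{\ge x} \le \lambda(\mathcal{B}^{\otimes n})_{\ge x}$$

of Lemma 3.1 uniformly for $x \in [1, \infty)$.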

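In detail, we assume that $\|\lambda(\mathcal{B})\|_\infty > 1$; otherwise $\mathcal{A}$ and
$\mathcal{B}$ are both entirely classical and Theorem 1.1 is easy. Since

$$\|\lambda(\mathcal{A})\|_p \le \|\lambda(\mathcal{B})\|_p$$

for all $p \in [1, \infty]$, then for any $k \ge 1$,

$$\|\lambda(\mathcal{A}^{\otimes k})\|_p < \|\lambda(\mathcal{B}^{\otimes (k+1)})\|_p.$$

The $\epsilon$ margin in Theorem 1.1 thus allows us to assume that

$$\|\lambda(\mathcal{A})\|_p < \|\lambda(\mathcal{B})\|_p$$

for all $p \in [1, \infty]$ by replacing $\mathcal{A}$ by $\mathcal{A}^{\otimes k}$ and $\mathcal{B}$ by $\mathcal{B}^{\otimes (k+1)}$.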
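The measure $\mu_{\mathcal{A}}$ is defined so that

$$\lambda(\mathcal{A})_{\ge e^t} = \int_t^{\infty} d\mu_{\mathcal{A}}(x)$$

and

$$\mu_{\mathcal{A}}^{*n} = \mu_{\mathcal{A}^{\otimes n}},$$

and likewise for $\mu_{\mathcal{B}}$. Therefore by Lemma 3.1 it suffices to
show that there exists an $n$ such that for all $t \ge 0$,

$$2 \int_{nt}^{\infty} d\mu_{\mathcal{A}}^{*n} \le \int_{nt}^{\infty} d\mu_{\mathcal{B}}^{*n}. \tag{1}$$

As in the statement of Theorem 3.2, let

$$\ell_{\mathcal{A}}(\beta) = \log \int_0^{\infty} e^{\beta x} \, d\mu_{\mathcal{A}}(x) = \log \|\lambda(\mathcal{A})\|_{\beta+1}^{\beta+1}$$
$$\ell_{\mathcal{B}}(\beta) = \log \int_0^{\infty} e^{\beta x} \, d\mu_{\mathcal{B}}(x) = \log \|\lambda(\mathcal{B})\|_{\beta+1}^{\beta+1}.$$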

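Observe that $\ell_{\mathcal{B}}(\beta)$ is a smooth, convex function, and that

$$\lim_{\beta \to \infty} \ell'_{\mathcal{B}}(\beta) = \log \|\lambda(\mathcal{B})\|_\infty < \infty.$$

It follows that $\ell''_{\mathcal{B}}(\beta)$ has a finite maximum $C$ for $\beta \in [0, \infty)$.
Note also that

$$\frac{\ell_{\mathcal{B}}(\beta) - \ell_{\mathcal{A}}(\beta)}{\beta + 1}$$

achieves a positive minimum, since

$$\lim_{\beta \to \infty} \frac{\ell_{\mathcal{B}}(\beta) - \ell_{\mathcal{A}}(\beta)}{\beta + 1} = \log \|\lambda(\mathcal{B})\|_\infty - \log \|\lambda(\mathcal{A})\|_\infty > 0.$$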



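Temporarily suppose that $t > \ell'_{\mathcal{B}}(0)$ and that $\beta = \beta(t)$ minimizes
$\ell_{\mathcal{B}}(\beta) - \beta t$. Let

$$s = \sqrt{\frac{2C}{n}}.$$

Then

$$\int_{n(t-s)}^{\infty} d\mu_{\mathcal{B}}^{*n} \ge e^{n(\ell_{\mathcal{B}}(\beta) - \beta t - \beta s) - \log 2}.$$

If $n$ is large enough that

$$2s + \frac{2 \log 2}{n} = 2\sqrt{\frac{2C}{n}} + \frac{2 \log 2}{n} < \min_{\beta} \frac{\ell_{\mathcal{B}}(\beta) - \ell_{\mathcal{A}}(\beta)}{\beta + 1},$$

then

$$2 \int_{n(t-s)}^{\infty} d\mu_{\mathcal{A}}^{*n} \le \int_{n(t-s)}^{\infty} d\mu_{\mathcal{B}}^{*n}.$$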

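Thus for some $\epsilon > 0$, inequality (1) holds for all $t \ge \ell'_{\mathcal{B}}(0) - \epsilon$.
If $t < \ell'_{\mathcal{B}}(0) - \epsilon$, let $u = \ell'_{\mathcal{B}}(0)$ and let $\beta = 0$. Then

$$\int_{nt}^{\infty} d\mu_{\mathcal{A}}^{*n} \le \int_0^{\infty} d\mu_{\mathcal{A}}^{*n} = e^{n \ell_{\mathcal{A}}(0)},$$

while

$$\int_{nt}^{\infty} d\mu_{\mathcal{B}}^{*n} \ge \int_{n(u-s)}^{\infty} d\mu_{\mathcal{B}}^{*n} \ge e^{n \ell_{\mathcal{B}}(0) - \log 2}$$

provided that $s < \epsilon$. Since $\ell_{\mathcal{A}}(0) < \ell_{\mathcal{B}}(0)$, inequality (1)
holds when $n$ is large enough. □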



4. ENTROPY 
4.1. Capacity 

Let $\mathcal{A}$ be a finite-dimensional *-algebra, where as before

$$\mathcal{A} = \bigoplus_{k=1}^{n} \mathcal{A}_k \cong \bigoplus_{k=1}^{n} \mathcal{M}_{\lambda_k}.$$

Let $\rho$ be a (mixed) state on $\mathcal{A}$; as explained above we view $\rho$
as a dual vector on $\mathcal{A}$ rather than as an element of $\mathcal{A}$. Let

$$\rho_k = \rho|_{\mathcal{A}_k}$$

be the restriction of $\rho$ to $\mathcal{A}_k$. Diagonalize each $\rho_k$ and let $r_{k,j}$
with $1 \le j \le \lambda_k$ be its diagonal entries. (In general a state $\rho$
on matrices is diagonal if and only if $\rho(A)$ depends only on
the diagonal entries of $A$. Equivalently in the present case we
can interpret $\rho$ as a density operator.) Let

$$r_k = \rho_k(I) = \sum_{j=1}^{\lambda_k} r_{k,j}$$

be the total density of $\rho$ in $\mathcal{A}_k$; evidently

$$\sum_{k=1}^{n} r_k = 1.$$

We also define the normalized state $\rho'_k$ on $\mathcal{A}_k$ by

$$\rho'_k = \frac{\rho_k}{r_k}$$

with diagonal entries

$$r'_{k,j} = \frac{r_{k,j}}{r_k}.$$

The classical entropy of the state $\rho$ on $\mathcal{A}$ is defined as

$$H_{\mathcal{A}}(\rho) = -\sum_{k=1}^{n} r_k \log r_k.$$

The quantum entropy of $\rho$ is defined as

$$S_{\mathcal{A}}(\rho) = -\sum_{k=1}^{n} \sum_{j=1}^{\lambda_k} r_{k,j} \log r'_{k,j}.$$

(Note that in the literature $H$ is also sometimes used to denote
quantum, or von Neumann, entropy. Here we follow the
convention of Nielsen and Chuang [10].) These two entropies
are supported by a number of elementary justifications: The
classical entropy of $\rho$ is the Shannon entropy of the restriction
of $\rho$ to the center of $\mathcal{A}$, which is a classical system. The
quantum entropy of $\rho$ is the expected value of the von Neumann
entropy of $\rho'_k$, where the index $k$ is chosen randomly
with probability $r_k$. Finally the total entropy

$$H_{\mathcal{A}}(\rho) + S_{\mathcal{A}}(\rho) = -\sum_{k=1}^{n} \sum_{j=1}^{\lambda_k} r_{k,j} \log r_{k,j}$$

has the same formula as both the Shannon and the von Neumann
entropy.
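The two entropies can be computed directly from the diagonal entries $r_{k,j}$. The following is a minimal sketch, assuming the state is already diagonalized and stored as a list of lists (one inner list per summand) and that logarithms are natural; the source does not fix a base, and the function names are illustrative.

```python
import math


def entropies(blocks):
    """Return (H, S) for a state with diagonal entries blocks[k][j] = r_{k,j}.

    H is the classical entropy of the block weights r_k, and S is the
    expected von Neumann entropy of the normalized blocks, both in nats.
    """
    H = 0.0
    S = 0.0
    for entries in blocks:
        r_k = sum(entries)
        if r_k == 0:
            continue                      # an empty block contributes nothing
        H -= r_k * math.log(r_k)
        for r_kj in entries:
            if r_kj > 0:
                S -= r_kj * math.log(r_kj / r_k)
    return H, S


def total_entropy(blocks):
    """Entropy of the full list of diagonal entries, ignoring the block structure."""
    return -sum(r * math.log(r) for entries in blocks for r in entries if r > 0)


uniform_trit = [[1 / 3, 1 / 3], [1 / 3]]   # uniform state on the hybrid trit
H, S = entropies(uniform_trit)
print(H, S, H + S, total_entropy(uniform_trit))
```

For the uniform state on the hybrid trit, $S = \frac{2}{3} \log 2$ and $H + S = \log 3$, in agreement with the total entropy formula.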

The proof of Theorem 1.2 is based on finding thermal states
of $\mathcal{A}$ with respect to a certain Hamiltonian. We define the
energy of the summand $\mathcal{A}_k$ as the negative of its capacity
for quantum entropy:

$$E_k = -\log \lambda_k(\mathcal{A}).$$

We retain the parameter $\beta$ from Section 3.2, setting $p = \beta + 1$,
and we also define the temperature $T = 1/\beta$. The thermal
state $\rho_T$ at temperature $T$ has the property that its restriction
$\rho_k$ to each $\mathcal{A}_k$ is uniform. If $\rho$ is any state with this property,
then its energy $E_{\mathcal{A}}(\rho)$ is, by definition, the negative of its
quantum entropy:

$$E_{\mathcal{A}}(\rho) = -S_{\mathcal{A}}(\rho).$$

The free energy of $\rho$ is therefore

$$F_{\mathcal{A}}(\rho) = E_{\mathcal{A}}(\rho) - T\bigl(H_{\mathcal{A}}(\rho) + S_{\mathcal{A}}(\rho)\bigr)$$
$$= -T\bigl(H_{\mathcal{A}}(\rho) + p S_{\mathcal{A}}(\rho)\bigr).$$

Since the thermal state minimizes the free energy, we have defined
energy so that the thermal state $\rho_T$ maximizes quantum
entropy plus classical entropy discounted by $p$. To compute
the maximum, recall that for the thermal state $\rho_T$, the free
energy is proportional to the log of the partition function:

$$F_{\mathcal{A}}(\rho_T) = -T \log Z_{\mathcal{A}}(\rho_T) = -T \log \sum_{k=1}^{n} \lambda_k e^{-E_k/T}$$
$$= -T \log \sum_{k=1}^{n} \lambda_k^{\beta+1} = -Tp \log \|\lambda(\mathcal{A})\|_p.$$

Therefore

$$\frac{H_{\mathcal{A}}(\rho_T)}{p} + S_{\mathcal{A}}(\rho_T) = \log \|\lambda(\mathcal{A})\|_p,$$

as desired.
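The equality case can be checked numerically. The sketch below assumes the thermal state is uniform within each block and gives block $k$ total weight $r_k = \lambda_k^p / \sum_i \lambda_i^p$, which is what the Boltzmann weights $e^{-E_k/T} = \lambda_k^{\beta}$ on each of the $\lambda_k$ basis states of block $k$ amount to; natural logarithms are assumed, and the function names are illustrative.

```python
import math


def thermal_entropies(shape, p):
    """Entropies (H, S) of the state that is uniform within each block and
    gives block k total weight proportional to shape[k] ** p."""
    Z = sum(x ** p for x in shape)        # partition function, equal to ||shape||_p ** p
    H = 0.0
    S = 0.0
    for x in shape:
        r_k = x ** p / Z
        H -= r_k * math.log(r_k)
        S += r_k * math.log(x)            # a uniform block of size x has entropy log x
    return H, S


def log_p_norm(shape, p):
    """Logarithm of the p-norm of the shape, for finite p."""
    return math.log(sum(x ** p for x in shape)) / p


shape = [2, 1, 1]                          # the shape used in Figure 1
for p in (1, 2, 3):
    H, S = thermal_entropies(shape, p)
    print(p, H / p + S, log_p_norm(shape, p))
```

For each listed $p$ the two printed values agree up to rounding, so this state sits on the bounding line of slope $-\frac{1}{p}$ of the capacity region.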

To prove the final claim of Theorem 1.2, observe that every
point in $C(\mathcal{A})$ can be written in the form

$$(H_{\mathcal{A}}(\rho_T) + t,\; S_{\mathcal{A}}(\rho_T) - s - t)$$

with $0 \le s, t$ and $s + t \le S_{\mathcal{A}}(\rho_T)$. Starting with the state $\rho_T$,
the quantum entropy in each block can be decreased to 0 without
changing the total probability of that block, hence without
changing the classical entropy. In this way we can absorb the
constant $s$. The remaining constant $t$ just matches the one in
the conclusion.



4.2. Noiseless coding 

A final justification for quantum and classical entropies is
Theorem 1.3, which we prove here. The theorem is a mutual
generalization of, and entirely analogous to, Shannon's clas- 
sical and Schumacher's purely quantum coding theorems II UL 
Thms. 12.4 & 12.6] 00- 

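Given an algebra $\mathcal{A}$ with a state $\rho$ and a second algebra $\mathcal{B}$,
a noiseless coding is a pair of decay quantum operations

$$\mathcal{A} \xrightarrow{\;\mathcal{Y}\;} \mathcal{B} \xrightarrow{\;\mathcal{X}\;} \mathcal{A}.$$

Since these are maps on algebras rather than state spaces, the
second map $\mathcal{X}$ is the encoding and the first map $\mathcal{Y}$ is the
decoding.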



We are interested in reliable noiseless coding, or in other
words high-fidelity, visible bulk-encoding. But a rigorous definition
of reliability is not obvious. Suppose that $\mathcal{E}$ is a decay
quantum operation from a memory $\mathcal{A}$ to itself, and that $\mathcal{A}$ has
a state $\rho$. If $\mathcal{C}$ is another memory, we define the $\mathcal{C}$-fidelity of
$\mathcal{E}$ to be

$$F_{\mathcal{C}}(\rho, \mathcal{E}) = \min_{\sigma \in (\mathcal{C} \otimes \mathcal{A})^*} 1 - D\bigl(\sigma, (\mathrm{id}_{\mathcal{C}} \otimes \mathcal{E}^T)(\sigma)\bigr), \tag{2}$$

where $D$ is the trace distance on states, and the minimum is
taken over states $\sigma$ on $\mathcal{C} \otimes \mathcal{A}$ that project to the state $\rho$ on
$\mathcal{A}$. In words, the $\mathcal{C}$-fidelity is the complement of the highest
probability that the operation $\mathcal{E}$ leaves the larger system
$\mathcal{C} \otimes \mathcal{A}$ in an erroneous state. We define the complete fidelity
$F(\rho, \mathcal{E})$ to be the infimum of $\mathcal{C}$-fidelity over all $\mathcal{C}$. It is not
hard to show that complete fidelity agrees with the classical
non-error rate when $\mathcal{A}$ is classical, and with entanglement fidelity
when $\mathcal{A}$ is purely quantum.

The more difficult half of Theorem 1.3 is the no-go direction.
To review, the heart of the no-go direction of the classical
encoding theorem is the following elementary fact about
squeezing states: If a state $\rho$ of a classical memory is encoded
into $b$ values, then it cannot be recovered with probability
greater than $b \|\rho\|_\infty$, where $\|\rho\|_\infty$ is the probability of
the most likely value of $\rho$. Or for simplicity, if $\rho$ is the uniform
state on a memory with $a$ values, then the non-error rate
is at most $b/a$. We will need a hybrid quantum generalization
of this inequality. To state it, we replace $\|\rho\|_\infty$ with a different
norm. If $\rho$ is a state on $\mathcal{A}$, define the dense-coding-based
supremum of $\rho$ by

$$\|\rho\|_d = \max_{j,k} \frac{r_{k,j}^2}{r_k}$$

in the notation of Section 4.1.

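Theorem 4.1. Let $\mathcal{A}$ and $\mathcal{B}$ be two hybrid quantum memories
and let $\rho$ be a state on $\mathcal{A}$. If

$$\mathcal{A} \xrightarrow{\;\mathcal{Y}\;} \mathcal{B} \xrightarrow{\;\mathcal{X}\;} \mathcal{A}$$

are decay quantum operations and $\frac{1}{p} + \frac{1}{q} = 1$, then

$$F(\rho, \mathcal{X} \circ \mathcal{Y}) \le \|\rho\|_d \, \|\lambda(\mathcal{A})\|_q \, \|\lambda(\mathcal{B})\|_p.$$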

Before proving Theorem 4.1, we discuss some special
cases. If $\mathcal{A} = \mathbb{C}^a$ is classical and $\rho$ is the uniform state, then
$\|\rho\|_d = \frac{1}{a}$. In this case, taking $p = 1$, Theorem 4.1 says that

$$F(\rho, \mathcal{X} \circ \mathcal{Y}) \le \frac{\|\lambda(\mathcal{B})\|_1}{a}.$$

This generalizes the classical squeezing result, bounding the
fidelity by the total number of independent states of $\mathcal{B}$
whether or not it is classical. On the other hand, if $\mathcal{A} = \mathcal{M}_a$
is purely quantum and $\rho$ is the uniform state, then $\|\rho\|_d = \frac{1}{a^2}$.
In this case, taking $p = \infty$, Theorem 4.1 says that

$$F(\rho, \mathcal{X} \circ \mathcal{Y}) \le \frac{\|\lambda(\mathcal{B})\|_\infty}{a}.$$

In other words, if $\mathcal{A}$ is purely quantum, then the fidelity of
squeezing is bounded by the largest quantum block of $\mathcal{B}$,
regardless of its classical capacity. (But if $\mathcal{B} = \mathcal{M}_b$ is also
purely quantum, then it can be shown that

$$F(\rho, \mathcal{X} \circ \mathcal{Y}) \le \frac{b^2}{a^2}$$

when $\rho$ is uniform. In this case if $b$ divides $a$, then multiplying
$\mathcal{B}$ by $\frac{a}{b}$ classical states can boost fidelity to $\frac{b}{a}$.)
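The right-hand side of Theorem 4.1 is easy to evaluate for concrete shapes. The sketch below only computes that bound, not the fidelity itself. It assumes the state is given by its diagonal entries grouped by block, and the names `dense_sup`, `conjugate` and `fidelity_bound` are hypothetical.

```python
import math


def p_norm(shape, p):
    """p-norm of a list of block sizes; p may be math.inf."""
    if p == math.inf:
        return float(max(shape))
    return sum(x ** p for x in shape) ** (1.0 / p)


def conjugate(p):
    """Return q with 1/p + 1/q = 1, for p in [1, inf]."""
    if p == 1:
        return math.inf
    if p == math.inf:
        return 1.0
    return p / (p - 1.0)


def dense_sup(blocks):
    """Maximum over (k, j) of r_{k,j}**2 / r_k, with blocks[k][j] = r_{k,j}."""
    return max(r ** 2 / sum(entries) for entries in blocks for r in entries if r > 0)


def fidelity_bound(blocks, shape_a, shape_b, p):
    """Right-hand side of Theorem 4.1 for the given state, shapes and exponent p."""
    return dense_sup(blocks) * p_norm(shape_a, conjugate(p)) * p_norm(shape_b, p)


# Classical case: uniform state on 4 values squeezed into 2 classical values, p = 1.
classical = fidelity_bound([[0.25]] * 4, [1, 1, 1, 1], [1, 1], 1)
# Quantum case: uniform state on a 4-state qudit, target shape (2, 2), p = infinity.
quantum = fidelity_bound([[0.25] * 4], [4], [2, 2], math.inf)
print(classical, quantum)
```

Both examples give the value $1/2$: the first is $\|\lambda(\mathcal{B})\|_1 / a$ with $a = 4$, and the second is $\|\lambda(\mathcal{B})\|_\infty / a$ for two copies of a qubit, matching the $b/a$ value mentioned in the parenthetical remark.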

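Proof. The operations $\mathcal{X}$ and $\mathcal{Y}$ admit Kraus representations

$$\mathcal{X}(B) = \bigoplus_k \sum_{j,\ell} X_{k,j,\ell}^* \, B_j \, X_{k,j,\ell}, \qquad \mathcal{Y}(A) = \bigoplus_j \sum_{k,m} Y_{j,k,m}^* \, A_k \, Y_{j,k,m},$$

where $k$ indexes the summands of $\mathcal{A}$ and $j$ indexes the summands of $\mathcal{B}$,
subject to the subunital conditions

$$\sum_{j,\ell} X_{k,j,\ell}^* X_{k,j,\ell} \le I, \qquad \sum_{k,m} Y_{j,k,m}^* Y_{j,k,m} \le I.$$

Thus

$$\sum_{j,\ell} \mathrm{Tr}(X_{k,j,\ell}^* X_{k,j,\ell}) \le \lambda_k(\mathcal{A}), \qquad \sum_{k,m} \mathrm{Tr}(Y_{j,k,m}^* Y_{j,k,m}) \le \lambda_j(\mathcal{B}). \tag{3}$$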



Recall the definition of $r_k$ and $\rho'_k$ in Section 4.1. The minimum
in equation (2) is obtained by lifting the state $\rho$ to the
completely correlated, completely entangled state

$$\sigma = \sum_k r_k \, \psi_k$$

on $\mathcal{A} \otimes \mathcal{A}$, where $\psi_k$ is a pure state that projects to $\rho'_k$. By
a computation similar to one in Nielsen and Chuang [10, p.
421], the fidelity is then given by

$$F = F(\rho, \mathcal{X} \circ \mathcal{Y}) = \sum_{j,k,\ell,m} r_k \, \bigl|\rho'_k(Y_{j,k,m} X_{k,j,\ell})\bigr|^2. \tag{4}$$



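Given any state $\sigma$ on the matrix algebra $\mathcal{M}_a$ and any matrices
$X \in \mathcal{M}_{b \times a}$ and $Y \in \mathcal{M}_{a \times b}$, the Cauchy-Schwarz inequality
and positivity together say that

$$|\sigma(YX)|^2 \le \sigma(X^* X) \, \sigma(Y Y^*) \le \|\sigma\|_\infty^2 \, \mathrm{Tr}(X^* X) \, \mathrm{Tr}(Y^* Y).$$

Applying this to equation (4), we obtain the bound

$$F \le \sum_{j,k,\ell,m} \|\rho\|_d \, \mathrm{Tr}(X_{k,j,\ell}^* X_{k,j,\ell}) \, \mathrm{Tr}(Y_{j,k,m}^* Y_{j,k,m}). \tag{5}$$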

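Define the numbers

$$x_{k,j} = \sum_{\ell} \mathrm{Tr}(X_{k,j,\ell}^* X_{k,j,\ell}), \qquad y_{j,k} = \sum_{m} \mathrm{Tr}(Y_{j,k,m}^* Y_{j,k,m})$$

and define the vectors $x = (x_{k,j})$ and $y = (y_{j,k})$. Then we can
restate inequality (5) as

$$F \le \sum_{j,k} \|\rho\|_d \, x_{k,j} \, y_{j,k} = \|\rho\|_d \; x \cdot y,$$

while equation (3) implies that

$$\|x\|_p \le \|\lambda(\mathcal{A})\|_p, \qquad \|y\|_q \le \|\lambda(\mathcal{B})\|_q.$$

Finally the Hölder inequality yields

$$F \le \|\rho\|_d \; x \cdot y \le \|\rho\|_d \, \|x\|_p \, \|y\|_q \le \|\rho\|_d \, \|\lambda(\mathcal{A})\|_p \, \|\lambda(\mathcal{B})\|_q$$

when $\frac{1}{p} + \frac{1}{q} = 1$, as desired. □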



Proof of Theorem 1.3. (Semi-sketch) As in the proofs of
Shannon's and Schumacher's theorems as presented by
Nielsen and Chuang [10], we first establish the existence of
an $\epsilon$-typical subalgebra $\mathcal{A}_{\mathrm{typ}}$ of $\mathcal{A}^{\otimes N}$ with respect to the state
$\rho$. (The $\epsilon$ in the proof here is not the same as the one in the
statement of the theorem, which we rename $\delta$.) We will take
$\epsilon$ to implicitly depend on $N$ with $\epsilon \to 0$ slowly as $N \to \infty$. We
will establish that $\mathcal{A}_{\mathrm{typ}}$ is approximately rectangular and that
the restriction $\rho_{\mathrm{typ}}$ of $\rho^{\otimes N}$ is approximately uniform. We will then confirm that if

$$(H, S) = (H_{\mathcal{A}}(\rho), S_{\mathcal{A}}(\rho)) \in C(\mathcal{B}),$$

then $\mathcal{A}_{\mathrm{typ}}$ embeds in $\mathcal{B}^{\otimes N(1+\delta)}$ for sufficiently large $N$; in particular
it reliably encodes. On the other hand, if $(H, S) \notin C(\mathcal{B})$,
we will confirm that $\mathcal{A}_{\mathrm{typ}}$ does not reliably encode in $\mathcal{B}^{\otimes N(1+\delta)}$;
indeed the fidelity of any encoding-decoding converges to 0
exponentially.

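Assume that the state $\rho$ on $\mathcal{A}$ is diagonalized and that $r_{k,j}$,
with $1 \le k \le n(\mathcal{A})$ and $1 \le j \le \lambda_k(\mathcal{A})$, are its diagonal entries.
Here $n(\mathcal{A})$ denotes the number of parts of $\lambda(\mathcal{A})$. This
induces a diagonalization of the state $\rho^{\otimes N}$ with a diagonal entry
$r_{K,J}$ for each pair of admissible sequences

$$K = (k_1, k_2, \ldots, k_N), \qquad J = (j_1, j_2, \ldots, j_N).$$

Here a pair is admissible if it is such that

$$1 \le k_t \le n(\mathcal{A}), \qquad 1 \le j_t \le \lambda_{k_t}(\mathcal{A})$$

for all $1 \le t \le N$.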



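Moreover, for each admissible $K$, $\mathcal{A}^{\otimes N}$ has an algebra summand
$(\mathcal{A}^{\otimes N})_K$. If $(K, J)$ and $(K, J')$ are two admissible
pairs, the algebra summand $(\mathcal{A}^{\otimes N})_K$ has an elementary matrix
$E_{K,J,J'}$; these matrices then form a basis of $\mathcal{A}^{\otimes N}$. We will
consider a set $T$ of admissible pairs $(K, J)$ called the typical
set; momentarily it can be any set. The span of the matrices
$E_{K,J,J'}$ with $(K, J), (K, J') \in T$ is a subalgebra $\mathcal{A}_{\mathrm{typ}}$. Another
way to describe the algebra $\mathcal{A}_{\mathrm{typ}}$ is to define the projector

$$P_{\mathrm{typ}} = \sum_{(K,J) \in T} E_{K,J,J}$$

and then let

$$\mathcal{A}_{\mathrm{typ}} = P_{\mathrm{typ}} \, \mathcal{A}^{\otimes N} P_{\mathrm{typ}}.$$

In this notation, the map

$$\mathcal{P}_{\mathrm{typ}}(X) = P_{\mathrm{typ}} X P_{\mathrm{typ}}$$

is an SUCP projection on $\mathcal{A}^{\otimes N}$ with image $\mathcal{A}_{\mathrm{typ}}$.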
Given $\alpha > 0$, say that an admissible pair $(K,J)$ is $\alpha$-typical if the number of occurrences $N(K,J;k,j)$ of $(k,j)$ satisfies

\[ \left| \frac{N(K,J;k,j)}{N} - r_{k,j} \right| < \alpha. \]

Let $T$ be the set of all $\alpha$-typical pairs. By repeated application of Chernoff's inequality (Theorem 3.2 in a more traditional probabilistic context),

\[ \rho^{\otimes N}(P_{\mathrm{typ}}) = \sum_{(K,J) \in T} r_{K,J} \to 1 \]

for any fixed $\alpha$ as $N \to \infty$. Moreover

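\[ F(\rho^{\otimes N}, \mathcal{P}) \ge \rho^{\otimes N}(P_{\mathrm{typ}})^2, \]

so for any fixed $\alpha$, $\mathcal{A}_{\mathrm{typ}}$ and $\mathcal{A}^{\otimes N}$ reliably encode into each other. At the same time, by a messy but straightforward calculation, if $\alpha$ is sufficiently small relative to $\epsilon$ (and depending on $\rho$ but not on $N$), $\mathcal{A}_{\mathrm{typ}}$ and $\rho_{\mathrm{typ}}$ have the following properties: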



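\[
\begin{aligned}
\left| \log n(\mathcal{A}_{\mathrm{typ}}) - HN \right| &< N\epsilon \\
\left| \log \lambda(\mathcal{A}_{\mathrm{typ}})_K - SN \right| &< N\epsilon \\
\log \|\rho_{\mathrm{typ}}\|_d + (H + 2S)N &< N\epsilon.
\end{aligned}
\tag{6}
\]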
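The claim that the typical set carries almost all of the probability for fixed tolerance can be illustrated in the simplest abelian case. The Python sketch below is an illustration under stated assumptions, not the argument of the proof: it takes a hypothetical two-outcome diagonal state with entries $(7/10, 3/10)$ and tolerance $1/20$, and it computes the exact binomial mass of the typical set instead of applying the Chernoff bound. The function name `typical_mass` is hypothetical.

```python
from fractions import Fraction
from math import exp, lgamma, log

def typical_mass(n_copies, r, alpha):
    """Exact probability that k occurrences in n_copies trials satisfy
    |k / n_copies - r| < alpha, where r and alpha are Fractions."""
    log_r, log_s = log(r), log(1 - r)
    total = 0.0
    for k in range(n_copies + 1):
        if abs(Fraction(k, n_copies) - r) < alpha:   # exact rational test
            log_pmf = (lgamma(n_copies + 1) - lgamma(k + 1)
                       - lgamma(n_copies - k + 1)
                       + k * log_r + (n_copies - k) * log_s)
            total += exp(log_pmf)
    return total

r = Fraction(7, 10)        # assumed diagonal entries (7/10, 3/10)
alpha = Fraction(1, 20)    # assumed typicality tolerance
masses = [typical_mass(n, r, alpha) for n in (10, 100, 1000)]
```

With two outcomes, the typicality condition on the second outcome is equivalent to the condition on the first, so a single test suffices. The list `masses` increases toward 1 as the number of copies grows.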



Suppose that $(H,S) \in C(\mathcal{B})$. In this case, let $C = e^{N(S+\epsilon)}$; then

\[ \lambda(\mathcal{A}_{\mathrm{typ}})_{>C} = 0. \]

Meanwhile equations (6) imply that

\[ \lambda(\mathcal{A}_{N,\epsilon})_{>0} \le e^{N(H+S+2\epsilon)}. \]

By a derivation using Cramér's bound like the one in the proof of Theorem 1.1

when $N$ is large enough, provided that $\epsilon$ is small compared to $\delta$. Thus by LemmaO $\mathcal{A}_{N,\epsilon}$ embeds in $\mathcal{B}^{\otimes N(1+\delta)}$ for large enough $N$, as desired.

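Suppose that $(H,S) \notin C(\mathcal{B})$. In this case, suppose that

\[ \mathcal{A}_{\mathrm{typ}} \stackrel{\mathcal{G}}{\longrightarrow} \mathcal{B}^{\otimes N(1+\delta)} \stackrel{\mathcal{H}}{\longrightarrow} \mathcal{A}_{\mathrm{typ}} \]

are quantum operations and that $\frac{1}{p} + \frac{1}{q} = 1$. By the first two equations of (6),

\[ \log \|\lambda(\mathcal{A}_{\mathrm{typ}})\|_q < \left( \frac{H}{q} + S + 2\epsilon \right) N. \]

Combining this with Theorem 4.1 and the last equation of (6), we obtain

\[
\begin{aligned}
\log F(\rho_{\mathrm{typ}}, \mathcal{H} \circ \mathcal{G}) &< (\epsilon - H - 2S)N + \left( \frac{H}{q} + S + 2\epsilon \right) N + \log \|\lambda(\mathcal{B}^{\otimes N(1+\delta)})\|_p \\
&= N \left( (1+\delta) \log \|\lambda(\mathcal{B})\|_p - \frac{H}{p} - S + 3\epsilon \right).
\end{aligned}
\]

Since $\delta$ must be sent to 0 and $\epsilon$ may be sent to 0, the fidelity therefore decays exponentially if there exists a $p$ such that

\[ \log \|\lambda(\mathcal{B})\|_p < \frac{H}{p} + S. \]

By the definition of $C(\mathcal{B})$, this inequality is equivalent to the assumed condition $(H,S) \notin C(\mathcal{B})$. Since $F(\rho_{\mathrm{typ}}, \mathcal{H} \circ \mathcal{G})$ decays exponentially, it cannot converge to 1. □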
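The criterion at the end of the proof can be explored numerically. The Python sketch below assumes, as the last step of the proof states, that a pair $(H,S)$ lies outside the capacity region exactly when $\log \|\lambda\|_p < H/p + S$ for some $p \ge 1$ (including $p = \infty$). The shape $(2,1)$, the grid search over $t = 1/p$, and the names `log_p_norm` and `capacity_margin` are illustrative assumptions, not the paper's construction.

```python
from math import log

def log_p_norm(shape, t):
    """log ||shape||_p with p = 1/t, for 0 <= t <= 1 (t = 0 means p = infinity)."""
    m = max(shape)
    if t == 0.0:
        return log(m)
    p = 1.0 / t
    # Factor out the largest part so that large powers cannot overflow.
    return log(m) + t * log(sum((s / m) ** p for s in shape))

def capacity_margin(shape, H, S, steps=1000):
    """Minimum over a grid of t = 1/p of log ||shape||_p - H/p - S."""
    return min(log_p_norm(shape, i / steps) - (i / steps) * H - S
               for i in range(steps + 1))

shape_B = [2, 1]   # hypothetical memory: one qubit block plus one 1-dimensional block
inside = capacity_margin(shape_B, H=0.0, S=log(2))        # one qubit of quantum entropy
outside = capacity_margin(shape_B, H=log(2), S=log(2))    # one bit and one qubit at once
```

A negative margin means that some $p$ violates the inequality, so the bound on $\log F$ above has a negative leading coefficient once $\delta$ and $\epsilon$ are small. Because the search uses a finite grid, a margin very close to zero should be treated as inconclusive.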



5. DISCUSSION 

Section 2 illustrates the principle that classical information
theory is the abelian special case of quantum information the- 
ory. Many authors maintain a dichotomy between the two 
theories by considering ensembles of mixed states. But such 
formalism is ultimately redundant, because an ensemble is itself a classical probabilistic state. More precisely, let

\[ \rho = \sum_k p_k \rho_k \in \mathcal{A} \]

be an ensemble of states in a memory $\mathcal{A}$. If the symbol $k$ is not recorded, then $\rho$ encodes all statistical information that can be extracted from the ensemble. But if each symbol $k$ is recorded as a state in another memory $\mathcal{B}$, then we can let

\[ \rho' = \sum_k p_k \rho_k \otimes \sigma_k \in \mathcal{A} \otimes \mathcal{B}. \]

If $\mathcal{B}$ is abelian and the $\sigma_k$'s are distinct pure states, then the state $\rho'$ denotes an ensemble with a record of its preparation. The term "ensemble" also typically implies that the memory $\mathcal{B}$ is hidden or untransmitted. This too is only a special case, because memory may be hidden whether or not it is abelian.
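A small worked example may make the role of the record concrete. In the Python sketch below, the memory holding the states is taken to be a single qubit and the record is a classical bit represented by diagonal $2 \times 2$ matrices. The particular ensemble (the states $|0\rangle\langle 0|$ and $|+\rangle\langle +|$ with equal weights) and all function names are assumptions chosen for illustration.

```python
def kron(a, b):
    """Kronecker product of square matrices stored as nested lists."""
    n, m = len(a), len(b)
    return [[a[i // m][j // m] * b[i % m][j % m] for j in range(n * m)]
            for i in range(n * m)]

def weighted_sum(weights, mats):
    """Return sum_k weights[k] * mats[k]."""
    size = len(mats[0])
    return [[sum(w * mat[i][j] for w, mat in zip(weights, mats))
             for j in range(size)] for i in range(size)]

def trace_out_record(mat, n, m):
    """Partial trace over the second (m-dimensional) tensor factor."""
    return [[sum(mat[i * m + k][j * m + k] for k in range(m))
             for j in range(n)] for i in range(n)]

p = [0.5, 0.5]
rho_0 = [[1.0, 0.0], [0.0, 0.0]]          # |0><0|
rho_1 = [[0.5, 0.5], [0.5, 0.5]]          # |+><+|
sigma = [[[1.0, 0.0], [0.0, 0.0]],        # record "k = 0"
         [[0.0, 0.0], [0.0, 1.0]]]        # record "k = 1"

rho = weighted_sum(p, [rho_0, rho_1])
rho_prime = weighted_sum(p, [kron(rho_0, sigma[0]), kron(rho_1, sigma[1])])

# Forgetting the record recovers the averaged state rho.
reduced = trace_out_record(rho_prime, 2, 2)
# Reading record value 1 leaves the unnormalized branch p_1 * rho_1.
branch_1 = [[rho_prime[i * 2 + 1][j * 2 + 1] for j in range(2)] for i in range(2)]
```

Here `reduced` agrees with `rho`, while `branch_1` retains the information about which state was prepared, information that `rho` alone does not contain.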

Theorems 1.1, 1.2, and 1.3 together suggest that all quantum
information can be measured in the bulk limit by two
numbers, classical entropy H and quantum entropy S. By con- 
trast information capacity has more structure than informa- 
tion itself. The capacity of a quantum memory is defined by 
a curve that represents trade-offs between classical and quan- 
tum entropy. The capacity of a general quantum channel could 
be even more complicated. 
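The trade-off curve can be probed numerically through the identity stated in the abstract, that the log of the $p$-norm of the shape is the maximum of $S + H/p$ over states. The Python sketch below makes a simplifying assumption that is not taken from the paper: it only considers states that are maximally mixed within each block, so that $S = \sum_k c_k \log \lambda_k$ and $H = -\sum_k c_k \log c_k$ for block weights $c_k$. The shape $(3,2,1)$ and the function names are hypothetical.

```python
import random
from math import log

def log_p_norm(shape, p):
    """log of the p-norm of a shape (vector of positive block sizes)."""
    return log(sum(s ** p for s in shape)) / p

def discounted_entropy(shape, c, p):
    """S + H/p for the state with block weights c and maximally mixed blocks."""
    S = sum(ck * log(s) for ck, s in zip(c, shape) if ck > 0)
    H = -sum(ck * log(ck) for ck in c if ck > 0)
    return S + H / p

shape = [3, 2, 1]     # hypothetical memory shape
p = 2.0
Z = sum(s ** p for s in shape)
best_c = [s ** p / Z for s in shape]      # weights proportional to size**p
best = discounted_entropy(shape, best_c, p)

random.seed(1)
trials = []
for _ in range(2000):
    w = [random.random() for _ in shape]
    c = [x / sum(w) for x in w]
    trials.append(discounted_entropy(shape, c, p))
```

Within this family the weights proportional to $\lambda_k^p$ attain $\log \|\lambda\|_p$, and no random choice of weights exceeds it. Varying $p$ moves the maximizing pair $(H,S)$ along the boundary of the region, which is one way to see why the capacity is a curve and not a single number.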

There are many interesting partial orderings on quantum 
memories besides embeddability, bulk embeddability, and su- 
permajorization. One natural example is embeddability in 
the presence of an auxiliary memory, or stable embeddability. Given memories $\mathcal{A}$ and $\mathcal{B}$, when is there a memory $\mathcal{C}$ such that

\[ \mathcal{A} \otimes \mathcal{C} \hookrightarrow \mathcal{B} \otimes \mathcal{C}? \]

We do not know when $\mathcal{A}$ stably embeds in $\mathcal{B}$. Stable embeddability implies bulk embeddability and is implied by embeddability, but we do not know how it compares to supermajorization order.

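Theorem 1.1 is related to a much more general question in quantum information theory. Let $\mathcal{E} : \mathcal{A} \to \mathcal{B}$ and $\mathcal{F} : \mathcal{C} \to \mathcal{D}$ be quantum operations representing two quantum channels between general quantum memories. When are there operations $\mathcal{H}_N$ and $\mathcal{G}_N$ that make the diagram

\[
\begin{array}{ccc}
\mathcal{A}^{\otimes N} & \stackrel{\mathcal{E}^{\otimes N}}{\longrightarrow} & \mathcal{B}^{\otimes N} \\
{\scriptstyle \mathcal{G}_N} \downarrow & & \uparrow {\scriptstyle \mathcal{H}_N} \\
\mathcal{C}^{\otimes N(1+\epsilon)} & \stackrel{\mathcal{F}^{\otimes N(1+\epsilon)}}{\longrightarrow} & \mathcal{D}^{\otimes N(1+\epsilon)}
\end{array}
\]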

commute with high fidelity? We can then say that the channel $\mathcal{E}$ reliably bulk-encodes in the channel $\mathcal{F}$. Theorems 1.1, 1.2, and 1.3 together answer the question when $\mathcal{E}$ and $\mathcal{F}$ are both the identity map, with the refinement that perfect fidelity is possible when high fidelity is possible. In light of Theorem |^] the cross-encoding question is also settled when $\mathcal{E}$ and $\mathcal{F}$ are SUCP projections.



Finally, it is well-understood that classical and quantum 
memory are inequivalent resources in quantum complexity 
theory. For example, there is a quantum algorithm to find a
collision of a 2-to-1 function which uses $O(N^{1/3})$ classical
space (and $O(1)$ quantum space) [2]. But if the function
only has a single repeated value, the best quantum algorithm 
uses 0(N l l A ) quantum space Q. It would be interesting to 
find an algorithm whose natural space complexity is hybrid 
quantum memory. 